Rank | Count | Beginning |
---|---|---|
250702 | 24162 | У |
27560 | 10255 | В |
142240 | 7642 | На |
84942 | 6433 | З |
87886 | 4428 | За |
187062 | 4359 | Після |
42829 | 3195 | Він |
74433 | 2920 | До |
282771 | 2764 | Це |
113299 | 2228 | Історія |
13487 | 2134 | Але |
154600 | 2119 | Населення |
72364 | 1962 | Для |
203084 | 1914 | При |
183645 | 1861 | Під |
238866 | 1678 | Також |
117379 | 1633 | Його |
51716 | 1373 | Вони |
128442 | 1285 | Крім |
167014 | 1207 | Однак |
22445 | 1200 | Біографія |
209042 | 1164 | Проте |
50537 | 1157 | Вона |
291307 | 1145 | Через |
297420 | 1117 | Як |
223822 | 1099 | Серед |
284101 | 1094 | Цей |
10477 | 1053 | А |
124202 | 926 | Коли |
298709 | 914 | Якщо |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N=1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N=1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
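The query above can be approximated outside the database for any N. The following is a minimal Python sketch, not the code used to produce the table: the function name `top_beginnings`, its parameters, and the three sample sentences are assumptions made for illustration. Like `substring_index(sentence, ' ', N)`, it splits on single spaces and keeps the whole sentence when it has fewer than N words.

```python
from collections import Counter


def top_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent sentence beginnings of n words.

    Hypothetical helper mirroring the SQL query: group sentences by
    their first n space-separated tokens and sort by descending count.
    """
    counts = Counter()
    for sentence in sentences:
        # Split on single spaces only, as substring_index(sentence, ' ', n) does.
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    return counts.most_common(limit)


# Illustrative sample sentences (not taken from the corpus).
sample = [
    "У місті живе багато людей.",
    "У 1991 році Україна стала незалежною.",
    "На півночі країни багато лісів.",
]

print(top_beginnings(sample, n=1))  # [('У', 2), ('На', 1)]
```

Calling `top_beginnings(sample, n=2)` would correspond to the setting of the next subsection, where each beginning consists of the first two words.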